The most fundamental result in probability theory is the law of large numbers
for a sequence $(X_n)_{n \ge 1}$ of independent and identically distributed
real valued random variables. Define the empirical mean of $(X_n)_{n \ge 1}$ by

\[ \overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i . \]

The law of large numbers asserts that the empirical mean $\overline{X}_n$
converges (almost surely) towards the theoretical mean $E(X_1)$ provided that
$E(|X_1|)$ is finite. The next fundamental results are the central limit
theorem and Cramér's theorem. Both are refinements of the law of large numbers
in two different directions. The central limit theorem describes the random
fluctuations of $\overline{X}_n$ around $E(X_1)$. Cramér's theorem estimates
the probability that $\overline{X}_n$ deviates significantly from $E(X_1)$:

\[ P(\overline{X}_n \ge E(X_1) + \varepsilon) \quad \text{for } \varepsilon > 0 . \]

Such an event is called a "large deviations" event since it has a very small
probability: it turns out that this probability decays exponentially fast with
$n$. The first estimate of this kind can be traced back to Cramér's paper [3],
which deals with variables possessing a density. In [4] Chernoff relaxed this
assumption. Then Lanford, coming from statistical mechanics, imported the
subadditivity argument into the proof [ ]. Cramér's theory was extended to
infinite dimensional topological vector spaces by Bahadur and Zabell [2]. The
classical texts of Azencott [ ], Deuschel and Stroock [ ] and Dembo and
Zeitouni [6] take stock of the foregoing improvements. At this point in time,
classical proofs of Cramér's theorem in $\mathbb{R}$ resort either to the law
of large numbers (see e.g. [6]) or to Mosco's theorem (see e.g. [ ]). We
present here a direct proof of Cramér's theorem in $\mathbb{R}$ based on convex
duality. Not only is the proof shorter, but it also adapts easily to a broader
setting.

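Theorem (Cramér). Let $(X_n)_{n \ge 1}$ be a sequence of independent and
identically distributed real valued random variables and let $\overline{X}_n$
be the empirical mean:

\[ \overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i . \]

For all $x \in \mathbb{R}$, the sequence

\[ \frac{1}{n} \log P(\overline{X}_n \ge x) \]

converges in $[-\infty, 0]$ and

\[ \lim_{n \to \infty} \frac{1}{n} \log P(\overline{X}_n \ge x)
   = \inf_{\lambda \ge 0} \big( \log E(e^{\lambda X_1}) - \lambda x \big) . \]
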
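
As an illustration of the statement (not part of the proof), the following
sketch compares the exact value of $\frac{1}{n} \log P(\overline{X}_n \ge x)$
with the right-hand side of the theorem. The choice of Bernoulli variables,
the parameter values and the function names are assumptions made for this
example only; for a Bernoulli variable with success probability $q$ and
$q < x < 1$, the infimum over $\lambda$ has the well-known closed form used in
`cramer_limit_bernoulli`.

```python
import math

def log_tail_bernoulli_mean(n, q, x):
    """Exact log P(mean of n Bernoulli(q) draws >= x), computed in log space."""
    # Smallest count k with k/n >= x; the tolerance guards against float round-off.
    k_min = max(math.ceil(n * x - 1e-9), 0)
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(q) + (n - k) * math.log(1 - q)
        for k in range(k_min, n + 1)
    ]
    if not log_terms:
        return -math.inf  # x > 1: the event is impossible
    top = max(log_terms)  # log-sum-exp to avoid underflow
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

def cramer_limit_bernoulli(q, x):
    """inf over lambda >= 0 of log E(exp(lambda X_1)) - lambda x, for q < x < 1."""
    return -(x * math.log(x / q) + (1 - x) * math.log((1 - x) / (1 - q)))

q, x = 0.5, 0.7  # illustrative values
for n in (10, 100, 1000):
    print(n, log_tail_bernoulli_mean(n, q, x) / n, cramer_limit_bernoulli(q, x))
```

The printed empirical rates stay below the limit for every $n$ and approach
it slowly, the gap being of order $(\log n)/n$.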
Let us define the entropy of the sequence by

\[ \forall x \in \mathbb{R} \qquad
   s(x) = \sup_{n \ge 1} \frac{1}{n} \log P(\overline{X}_n \ge x) \]

and the pressure of $X_1$ (or the log-Laplace transform of the law of $X_1$) by

\[ \forall \lambda \in \mathbb{R} \qquad p(\lambda) = \log E(e^{\lambda X_1}) . \]

The entropy and the pressure may take infinite values. Our strategy is to show
a dual version, in the sense of convex functions, of Cramér's theorem.

Proposition (dual equality). For all $\lambda \ge 0$,

\[ p(\lambda) = \sup_{u \in \mathbb{R}} \big( \lambda u + s(u) \big) . \]
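
The dual equality can be checked numerically in a simple case. The sketch
below is an illustration under stated assumptions, not a step of the proof: it
takes Bernoulli variables with success probability `q`, and it assumes the
closed form of the entropy $s$ that Cramér's theorem gives for this law. The
supremum over $u$ is approximated by a grid search on $[0, 1]$, which is
enough here because $\lambda u + s(u)$ is at most its value at $u = 0$ for
$u < 0$ and equals $-\infty$ for $u > 1$. All names are hypothetical.

```python
import math

def entropy_bernoulli(q, u):
    """s(u) for Bernoulli(q), assuming the closed form given by Cramér's theorem."""
    if u <= q:
        return 0.0
    if u > 1:
        return -math.inf
    if u == 1:
        return math.log(q)
    return -(u * math.log(u / q) + (1 - u) * math.log((1 - u) / (1 - q)))

def pressure_bernoulli(q, lam):
    """p(lam) = log E(exp(lam * X_1)) for Bernoulli(q)."""
    return math.log(1 - q + q * math.exp(lam))

def dual_sup(q, lam, grid=20001):
    """Grid approximation of sup over u in [0, 1] of lam * u + s(u)."""
    points = (i / (grid - 1) for i in range(grid))
    return max(lam * u + entropy_bernoulli(q, u) for u in points)

q = 0.3  # illustrative value
for lam in (0.0, 0.5, 1.0, 3.0):
    print(lam, pressure_bernoulli(q, lam), dual_sup(q, lam))
```

The two printed columns agree up to the grid resolution.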

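Proof. The classical Chebyshev inequality will yield one part of the proof of
the above equality. To prove the other part we condition $\overline{X}_n$ to be
bounded by $K$ and then let $K$ grow towards $+\infty$. Since $X_1, \dots, X_n$
are independent and identically distributed, we have for all $\lambda \ge 0$

\[ \forall n \ge 1 \qquad E(e^{n \lambda \overline{X}_n}) = E(e^{\lambda X_1})^n . \]

Thus, for $u \in \mathbb{R}$ and $n \ge 1$, it follows from Chebyshev's
inequality that

\[ p(\lambda) = \log E(e^{\lambda X_1})
   = \frac{1}{n} \log E(e^{n \lambda \overline{X}_n})
   \ge \frac{1}{n} \log \Big( e^{n \lambda u}
       P\big( e^{n \lambda \overline{X}_n} \ge e^{n \lambda u} \big) \Big)
   \ge \lambda u + \frac{1}{n} \log P(\overline{X}_n \ge u) . \]

Hence, taking the supremum over $n \ge 1$ and then over $u \in \mathbb{R}$, we
get

\[ \forall \lambda \ge 0 \qquad
   p(\lambda) \ge \sup_{u \in \mathbb{R}} \big( \lambda u + s(u) \big) . \]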
Next, we prove the converse inequality for $\lambda = 0$: for all
$u \in \mathbb{R}$,

\[ 0 \ge s(u) \ge \sup_{n \ge 1} \frac{1}{n} \log P(X_1 \ge u)^n
   = \log P(X_1 \ge u) . \]

Hence, letting $u$ go to $-\infty$, we see that

\[ \sup_{u \in \mathbb{R}} s(u) = 0 = p(0) . \]

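Now let $\lambda > 0$ and $K > 0$. For all $n \ge 1$, using the fact that
$X_1, \dots, X_n$ are independent and identically distributed, we have

\[ \log E\big( e^{\lambda X_1} 1_{|X_1| \le K} \big)
   = \frac{1}{n} \log E\big( e^{\lambda (X_1 + \cdots + X_n)}
       1_{|X_1| \le K} \cdots 1_{|X_n| \le K} \big)
   \le \frac{1}{n} \log E\big( e^{n \lambda \overline{X}_n}
       1_{|\overline{X}_n| \le K} \big) \]

\[ \le \frac{1}{n} \log \Big( e^{-n \lambda K}
       + \int_{-K}^{K} n \lambda \, e^{n \lambda u}
         P(\overline{X}_n \ge u) \, du \Big) , \]

the last step being a consequence of Fubini's theorem, applied to the identity
$e^{n \lambda \overline{X}_n} = e^{-n \lambda K}
+ \int_{-K}^{\overline{X}_n} n \lambda \, e^{n \lambda u} \, du$. Since, by the
definition of $s$,

\[ \forall u \in \mathbb{R} \quad \forall n \ge 1 \qquad
   P(\overline{X}_n \ge u) \le e^{n s(u)} , \]

we get

\[ \log E\big( e^{\lambda X_1} 1_{|X_1| \le K} \big)
   \le \frac{1}{n} \log \Big( e^{-n \lambda K}
       + \int_{-K}^{K} n \lambda \, e^{n (\lambda u + s(u))} \, du \Big) \]

\[ \le \frac{1}{n} \log \Big( e^{-n \lambda K}
       + 2 K n \lambda \exp \Big( n \sup_{u \in \mathbb{R}}
         \big( \lambda u + s(u) \big) \Big) \Big) . \]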

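Let $K$ be large enough so that (recall that the supremum of $s$ is 0)

\[ -\lambda K < \sup_{u \in \mathbb{R}} \big( \lambda u + s(u) \big) . \]

Sending $n$ to $\infty$ we obtain

\[ \log E\big( e^{\lambda X_1} 1_{|X_1| \le K} \big)
   \le \sup_{u \in \mathbb{R}} \big( \lambda u + s(u) \big) . \]

Eventually sending $K$ to $+\infty$ we get

\[ p(\lambda) \le \sup_{u \in \mathbb{R}} \big( \lambda u + s(u) \big) . \qquad \Box \]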
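
The passage to the limit in $n$ rests on a small piece of arithmetic: when
$-\lambda K < S$, the quantity
$\frac{1}{n} \log ( e^{-n \lambda K} + 2 K n \lambda \, e^{n S} )$ tends to
$S$. The sketch below illustrates this numerically. Here `S` is a hypothetical
stand-in for $\sup_u (\lambda u + s(u))$, and the numerical values are
arbitrary choices made for the example.

```python
import math

def truncated_bound(n, lam, K, S):
    """(1/n) * log(exp(-n*lam*K) + 2*K*n*lam*exp(n*S)), evaluated in log space."""
    a = -n * lam * K                          # log of the first term
    b = math.log(2 * K * n * lam) + n * S     # log of the second term
    top = max(a, b)                           # log-sum-exp avoids overflow
    return (top + math.log(math.exp(a - top) + math.exp(b - top))) / n

lam, K, S = 1.0, 5.0, 0.5  # illustrative values satisfying -lam*K < S
for n in (10, 1000, 10**6):
    print(n, truncated_bound(n, lam, K, S))
```

The printed values decrease towards `S`; the excess is of order $(\log n)/n$
and comes from the polynomial prefactor $2 K n \lambda$.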

To deduce Cramér's theorem from the dual equality we have just proved, we
need some properties of the function s. First of all it follows from the definition 
of s that s is non-increasing. 
Proposition. For all $x \in \mathbb{R}$, the sequence

\[ \frac{1}{n} \log P(\overline{X}_n \ge x) \]

converges in $[-\infty, 0]$ towards $s(x)$. The function
$s : \mathbb{R} \to [-\infty, 0]$ is concave.

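Proof. Let $x, y \in \mathbb{R}$ with $x \le y$. Let also
$\alpha \in \, ]0, 1[$. Suppose that $P(X_1 \ge y) > 0$ and let
$n \ge m \ge 1$. Let $n = mq + r$ be the Euclidean division of $n$ by $m$. On
the event

\[ \bigcap_{k=0}^{\lfloor \alpha q \rfloor - 1}
     \Big\{ \sum_{i=mk+1}^{m(k+1)} X_i \ge m x \Big\}
   \;\cap\;
   \bigcap_{k=\lfloor \alpha q \rfloor}^{q-1}
     \Big\{ \sum_{i=mk+1}^{m(k+1)} X_i \ge m y \Big\}
   \;\cap\;
   \bigcap_{i=mq+1}^{n} \{ X_i \ge y \} \]

we have (remember that $x \le y$):

\[ \sum_{i=1}^{n} X_i
   \ge \lfloor \alpha q \rfloor \, m x
     + (q - \lfloor \alpha q \rfloor) \, m y + r y
   \ge n \big( \alpha x + (1 - \alpha) y \big) . \]

Therefore, the blocks being independent and identically distributed,

\[ P\big( \overline{X}_n \ge \alpha x + (1 - \alpha) y \big)
   \ge P(\overline{X}_m \ge x)^{\lfloor \alpha q \rfloor} \,
       P(\overline{X}_m \ge y)^{q - \lfloor \alpha q \rfloor} \,
       P(X_1 \ge y)^{r} . \]

Since $P(X_1 \ge y) > 0$ and $0 \le r < m$, taking logarithms, dividing by $n$
and sending $n$ to $\infty$, we get for all $m \ge 1$

\[ \liminf_{n \to \infty} \frac{1}{n}
     \log P\big( \overline{X}_n \ge \alpha x + (1 - \alpha) y \big)
   \ge \frac{\alpha}{m} \log P(\overline{X}_m \ge x)
     + \frac{1 - \alpha}{m} \log P(\overline{X}_m \ge y) . \]

If $x = y$, taking the supremum over $m \ge 1$, we conclude that

\[ \lim_{n \to \infty} \frac{1}{n} \log P(\overline{X}_n \ge x) = s(x) . \]

If $x < y$, sending $m$ to $\infty$, we get

\[ s\big( \alpha x + (1 - \alpha) y \big)
   \ge \alpha \, s(x) + (1 - \alpha) \, s(y) . \]

If $y$ is such that $P(X_1 \ge y) = 0$, then

\[ \forall n \ge 1 \qquad \frac{1}{n} \log P(\overline{X}_n \ge y) = -\infty \]

whence $s(y) = -\infty$ and the concavity inequality still holds. $\Box$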
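
The block inequality at the heart of this proof can be tested on exact
binomial probabilities. The following sketch is only an illustration: the
Bernoulli law, the parameter values and the function names are assumptions
chosen for the example. It returns the logarithms of both sides of

\[ P\big( \overline{X}_n \ge \alpha x + (1 - \alpha) y \big)
   \ge P(\overline{X}_m \ge x)^{\lfloor \alpha q \rfloor} \,
       P(\overline{X}_m \ge y)^{q - \lfloor \alpha q \rfloor} \,
       P(X_1 \ge y)^{r} . \]

```python
import math

def log_tail(n, p, t):
    """log P(S_n >= n*t) for S_n ~ Binomial(n, p), assuming the probability is positive."""
    # Smallest integer count k with k >= n*t; the tolerance guards against round-off.
    k_min = max(math.ceil(n * t - 1e-9), 0)
    prob = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))
    return math.log(prob)

def block_inequality_sides(n, m, alpha, x, y, p):
    """Logs of the left and right sides of the block inequality, for Bernoulli(p)."""
    q, r = divmod(n, m)            # Euclidean division n = m*q + r
    a = math.floor(alpha * q)      # number of blocks required to reach level x
    lhs = log_tail(n, p, alpha * x + (1 - alpha) * y)
    rhs = a * log_tail(m, p, x) + (q - a) * log_tail(m, p, y) + r * log_tail(1, p, y)
    return lhs, rhs

for n, m in ((23, 5), (40, 7), (100, 10)):  # illustrative sizes
    print(n, m, block_inequality_sides(n, m, alpha=0.5, x=0.6, y=0.8, p=0.5))
```

In each printed pair the first number is at least the second, as the
inequality requires.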

We now finish the proof of Cramér's theorem. At this point we know that for
all $x \in \mathbb{R}$

\[ \inf_{\lambda \ge 0} \big( p(\lambda) - \lambda x \big)
   = \inf_{\lambda \ge 0} \sup_{u \in \mathbb{R}}
     \big( \lambda (u - x) + s(u) \big) . \]

It remains to prove that the latter quantity equals $s(x)$. The result is
standard in the theory of convex functions. Let us give an elementary proof in
our setting. The right-hand side of the previous equation is clearly greater
than or equal to $s(x)$: take $u = x$. To prove the converse inequality we set

\[ c = \inf \{ x \in \mathbb{R} : P(X_1 \ge x) = 0 \} \]

and we distinguish the three cases $x < c$, $x > c$ and $x = c$.

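• Suppose $x < c$. Since $s$ is concave and non-increasing, let
$-\lambda = s'_g(x) \le 0$ be the left derivative of $s$ at the point $x$.
Then $s(x) > -\infty$ and

\[ \forall u \in \mathbb{R} \qquad s(u) \le s(x) - \lambda (u - x) \]

from which the result follows.

• Suppose $x > c$. Then $s(x) = -\infty$ and, for all $\lambda \ge 0$,

\[ p(\lambda) - \lambda x = \log E\big( e^{\lambda (X_1 - x)} \big)
   \le \log e^{\lambda (c - x)} = \lambda (c - x) \]

so the infimum over $\lambda \ge 0$ is indeed $-\infty$.

• Suppose $x = c$. Then, for all $\lambda \ge 0$ and $\varepsilon > 0$,

\[ p(\lambda) - \lambda c
   = \log E\Big( e^{\lambda (X_1 - c)}
       \big( 1_{X_1 < c - \varepsilon}
           + 1_{c - \varepsilon \le X_1 \le c} \big) \Big)
   \le \log \big( e^{-\lambda \varepsilon}
       + P(X_1 \ge c - \varepsilon) \big) . \]

Taking the infimum over $\lambda \ge 0$ and sending $\varepsilon$ to 0 we get

\[ \inf_{\lambda \ge 0} \big( p(\lambda) - \lambda x \big)
   \le \log P(X_1 \ge c) \le s(c) . \qquad \Box \]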
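
The two boundary cases can be made concrete with a Bernoulli variable, for
which $c = 1$ and $P(X_1 \ge c) = q$. The sketch below is an illustration
under that assumption, with hypothetical names and arbitrary parameter values.
It evaluates $p(\lambda) - \lambda x$ for a large $\lambda$: at $x = c$ the
value approaches $\log P(X_1 \ge c)$, and for $x > c$ it lies below
$\lambda (c - x)$, which tends to $-\infty$ as $\lambda$ grows.

```python
import math

def shifted_pressure(q, lam, x):
    """p(lam) - lam*x for Bernoulli(q), i.e. log E(exp(lam * (X_1 - x))).

    Written with non-positive exponents for x >= 1 so that a large lam
    does not overflow.
    """
    return math.log((1 - q) * math.exp(-lam * x) + q * math.exp(lam * (1 - x)))

q, c, lam = 0.3, 1.0, 50.0  # illustrative values; c = 1 for a Bernoulli variable

# Case x = c: the value is close to log P(X_1 >= c) = log q.
print(shifted_pressure(q, lam, c), math.log(q))

# Case x > c: the value is bounded above by lam * (c - x).
x = 1.2
print(shifted_pressure(q, lam, x), lam * (c - x))
```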